Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences
Speaking rate refers to the average number of phonemes within some unit time,
while the rhythmic patterns refer to duration distributions for realizations of
different phonemes within different phonetic structures. Both are key
components of prosody in speech, which is different for different speakers.
Models such as the cycle-consistent adversarial network (Cycle-GAN) and the variational
auto-encoder (VAE) have been successfully applied to voice conversion tasks
without parallel data. However, due to the neural network architectures and
feature vectors chosen for these approaches, the length of the predicted
utterance has to be fixed to that of the input utterance, which limits the
flexibility in mimicking the speaking rates and rhythmic patterns for the
target speaker. On the other hand, sequence-to-sequence learning models have been used
to remove this length constraint, but they require parallel training data.
In this paper, we propose an approach that uses a sequence-to-sequence model
trained with unsupervised Cycle-GAN to perform the transformation between the
phoneme posteriorgram sequences for different speakers. In this way, the length
constraint mentioned above is removed to offer rhythm-flexible voice conversion
without requiring parallel data. Preliminary evaluation on two datasets showed
very encouraging results. Comment: 8 pages, 6 figures, Submitted to SLT 201
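The two quantities defined in this abstract, speaking rate and per-phoneme durations, can be read off a phoneme posteriorgram in a few lines. The following is only a minimal sketch of those definitions, not the paper's method: the function names, the 10 ms default frame shift, and the frame-wise argmax decoding are all assumptions made for illustration, and adjacent repetitions of the same phoneme are merged into one segment.

```python
def segment_durations(posteriorgram, phonemes, frame_shift_s=0.01):
    """Collapse frame-wise argmax labels into (phoneme, duration_s) segments.

    posteriorgram: list of frames, each a list of posterior probabilities,
    one per entry in `phonemes`. frame_shift_s is an assumed frame shift.
    """
    runs = []  # each entry is [label, frame_count]
    for frame in posteriorgram:
        best = max(range(len(frame)), key=frame.__getitem__)
        label = phonemes[best]
        if runs and runs[-1][0] == label:
            runs[-1][1] += 1          # same phoneme continues
        else:
            runs.append([label, 1])   # a new phoneme segment starts
    return [(label, count * frame_shift_s) for label, count in runs]


def speaking_rate(segments):
    """Average number of phoneme segments per second."""
    total = sum(duration for _, duration in segments)
    return len(segments) / total if total else 0.0


# Toy input: two phonemes, five frames, 0.5 s per frame (hypothetical values).
toy_ppg = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
toy_segments = segment_durations(toy_ppg, ["a", "b"], frame_shift_s=0.5)
toy_rate = speaking_rate(toy_segments)
```

Collecting the durations per phoneme over many utterances would give the duration distributions that the abstract calls rhythmic patterns.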
Bandwidth allocation and pricing problem for a duopoly market
This research discusses the Internet service provider (ISP) bandwidth allocation and pricing problems for a duopoly bandwidth market with two competing ISPs. According to the contracts between Internet subscribers and ISPs, subscribers can use their services up to their contracted bandwidth limits. In reality, however, many subscribers find that their online requests are denied or that their connection speeds are far below their contracted limits. One reason is that ISPs accept too many subscribers. To avoid this problem, ISPs can limit the number of subscribers to improve service quality. This paper develops a constrained nonlinear programming model to deal with this problem for two competing ISPs. The condition for reaching equilibrium between the two competing firms is derived. The market equilibrium price and bandwidth resource allocations are derived as closed-form solutions.
Comparing How a Chatbot References User Utterances from Previous Chatting Sessions: An Investigation of Users' Privacy Concerns and Perceptions
Chatbots are capable of remembering and referencing previous conversations,
but does this enhance user engagement or infringe on privacy? To explore this
trade-off, we investigated the format of how a chatbot references previous
conversations with a user and its effects on a user's perceptions and privacy
concerns. In a three-week longitudinal between-subjects study, 169 participants
talked about their dental flossing habits to a chatbot that either, (1-None):
did not explicitly reference previous user utterances, (2-Verbatim): referenced
previous utterances verbatim, or (3-Paraphrase): used paraphrases to reference
previous utterances. Participants perceived Verbatim and Paraphrase chatbots as
more intelligent and engaging. However, the Verbatim chatbot also raised
privacy concerns among participants. To gain insight into why people preferred
certain conditions or had privacy concerns, we conducted semi-structured
interviews with 15 participants. We discuss implications from our findings that
can help designers choose an appropriate format to reference previous user
utterances and inform the design of longitudinal dialogue scripting. Comment: 10 pages, 3 figures, to be published in Proceedings of the 11th
International Conference on Human-Agent Interaction (ACM HAI'23)
Modeling and Algorithm for Multiple Spanning Tree Provisioning in Resilient and Load Balanced Ethernet Networks
We propose a multitree-based fast failover scheme for Ethernet networks. In our system, only a few spanning trees are used to carry working traffic in the normal state. When a failure happens, the nodes adjacent to the failure redirect traffic to the preplanned backup VLAN trees to realize fast failure recovery. In the proposed scheme, a new leaf constraint is enforced on the backup trees. It enables the network to provide 100% survivability against any single link failure and any single node failure. Besides fast failover, we also take load balancing into consideration. We model an Ethernet network as a two-layered graph and propose an Integer Linear Programming (ILP) formulation for the problem. We further propose a heuristic algorithm to provide solutions for large networks. The simulation results show that the proposed scheme can achieve high survivability while maintaining load balancing. In addition, we have implemented the proposed scheme in an FPGA system. The experimental results show that it takes only a few μsec to recover from a network failure. This is far faster than the 50 msec requirement used in telecommunication networks for network protection.
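One plausible reading of the leaf constraint can be checked mechanically. The sketch below is an illustration under stated assumptions, not the paper's ILP or heuristic: it assumes a single link failure is survivable when some backup tree avoids that link, and a single node failure is survivable when some backup tree has that node as a leaf (removing a leaf leaves the rest of the tree connected). The function names and the triangle example are hypothetical.

```python
from collections import Counter


def leaf_nodes(tree_edges):
    """Return the nodes that have degree 1 in the given tree."""
    degree = Counter()
    for u, v in tree_edges:
        degree[u] += 1
        degree[v] += 1
    return {node for node, d in degree.items() if d == 1}


def survives_single_failures(nodes, links, backup_trees):
    """Check the assumed survivability conditions for a set of backup trees.

    links and tree edges are undirected (u, v) pairs.
    """
    tree_links = [{frozenset(e) for e in tree} for tree in backup_trees]
    tree_leaves = [leaf_nodes(tree) for tree in backup_trees]
    # Every link must be absent from at least one backup tree.
    link_ok = all(
        any(frozenset(link) not in used for used in tree_links)
        for link in links
    )
    # Every node must be a leaf in at least one backup tree.
    node_ok = all(
        any(node in leaves for leaves in tree_leaves)
        for node in nodes
    )
    return link_ok and node_ok


# Hypothetical triangle network with its three spanning trees.
nodes = ["A", "B", "C"]
links = [("A", "B"), ("B", "C"), ("C", "A")]
t1 = [("A", "B"), ("B", "C")]
t2 = [("B", "C"), ("C", "A")]
t3 = [("C", "A"), ("A", "B")]

all_three = survives_single_failures(nodes, links, [t1, t2, t3])
only_two = survives_single_failures(nodes, links, [t1, t2])
```

With only `t1` and `t2`, link B-C appears in both trees, so a failure of that link has no backup tree that avoids it; adding `t3` closes that gap.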
Refinement treatment of nasal bone fracture: A 6-year study of 329 patients
Background: The reliability of X-ray radiography for diagnosing nasal bone fractures (NBFs) remains controversial. Recent studies show that, for determining the orientation and location of the displaced/depressed fracture, nasal sonography is as accurate as facial computed tomography. This retrospective study compared conductor-assisted nasal sonography (CANS) to conventional diagnostic tools and reported subjective patient satisfaction and discomfort after closed reduction combined with tube technique. Methods: This retrospective study reports the results of 329 refinement treatments for nasal bone fracture (including 199 men and 130 women) performed from 2005 to 2011. All patients were assessed with CANS and completed a survey immediately prior to removing the packing. Questionnaires were adapted from the nasal obstruction symptom evaluation (NOSE) scale. Results: The study found that CANS has a 97.2% rate of accuracy in diagnosing NBF. The visual analog scale scores of nasal obstruction, nasal congestion, sleep disturbance, trouble breathing, and inability to move air through the nose were analyzed. The experimental group scores were significantly different from the control group for all scores (p < 0.001). Conclusion: Compared to conventional methods, CANS is more accurate for detecting NBF. We recommend its use as an alternative tool for diagnosing a nasal fracture. Because the tube technique balances pressure between the nasopharynx and middle ear during swallowing, patient comfort is enhanced. Application of these modifications can improve accuracy in diagnosing NBF and can improve the quality of NBF treatment.